Journal: bioRxiv
Article Title: Engineering of CRISPR-Cas PAM recognition using deep learning of vast evolutionary data
doi: 10.1101/2025.01.06.631536
Figure Legend Snippet: (a) Phylogenetic trees were built for Cas8, Cas9, and Cas12 proteins. Proteins were first clustered using MMseqs2 at 70% identity for Cas8 and Cas9 and at 95% identity for Cas12. Phylogenetic trees were built using FastTree and visualized using iTOL. Colored strips indicate the information content at PAM positions. (b) Distribution of high-information-content positions across PAMs from Type I, II, and V systems. In Type I systems, the PAM is predominantly restricted to positions −1 to −3 relative to the protospacer, while in Type II systems, the distribution of high-information-content PAM positions is more variable. (c) Distribution of the number of spacers aligned to virus and plasmid genomes for PAM predictions from the CRISPR-Cas Atlas. (d) Signal-to-noise ratio comparing nucleotide conservation upstream and downstream of the protospacer for PAM predictions from the CRISPR-Cas Atlas. In Type II systems, a downstream motif is expected, while in Type I and V systems, the motif is upstream. Bioinformatic PAM predictions are based on a high number of aligned CRISPR spacers, resulting in strong signal-to-noise ratios and providing a robust training dataset for Protein2PAM.
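The "information content at PAM positions" in panels (a) and (b) can be illustrated with a minimal sketch. It assumes the standard sequence-logo definition (log2 of the alphabet size minus the Shannon entropy of the observed nucleotide frequencies, in bits) with no small-sample correction. The legend does not state the exact formula the authors used, and the function name and example flanking sequences below are hypothetical.

```python
import math
from collections import Counter


def information_content(seqs, alphabet="ACGT"):
    """Per-position information content in bits.

    Assumed definition: log2(len(alphabet)) minus the Shannon entropy of the
    observed base frequencies at each position (no small-sample correction).
    """
    if not seqs:
        raise ValueError("at least one sequence is required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")

    max_bits = math.log2(len(alphabet))
    result = []
    for i in range(length):
        counts = Counter(s[i] for s in seqs)
        # Only count bases in the alphabet; ambiguous bases such as N are ignored.
        total = sum(counts[b] for b in alphabet)
        if total == 0:
            result.append(0.0)
            continue
        entropy = 0.0
        for b in alphabet:
            p = counts[b] / total
            if p > 0:
                entropy -= p * math.log2(p)
        result.append(max_bits - entropy)
    return result


# Hypothetical flanking sequences aligned next to protospacers (an NGG-like motif).
flanks = ["AGG", "CGG", "GGG", "TGG"]
ic = information_content(flanks)
```

In this toy input the first position is uniformly distributed and carries no information, while the two fully conserved positions each reach the 2-bit maximum for a four-letter alphabet.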
Article Snippet: Briefly, the PAM library plasmids were linearized with PvuI-HF (NEB).
Techniques: Virus, Plasmid Preparation, CRISPR